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O 

I By explicitly representing the reaction times of discrete chemical systems as the 



firing times of independent, unit rate Poisson processes we develop a new adaptive 
tau-leaping procedure. The procedure developed is novel in that accuracy is guar- 
anteed by performing post-leap checks. Because the representation we use separates 
the randomness of the model from the state of the system, we are able to perform 
the post-leap checks in such a way that the statistics of the sample paths generated 
O ! will not be skewed by the rejections of leaps. Further, since any leap condition is 



ensured with a probability of one, the simulation method naturally avoids negative 
population values. 



INTRODUCTION 



The procedure developed in this paper is a tau-leaping method for simulating the 
evolution of discrete stochastic chemical systems. The novelty of the procedure is 
that a post-leap check is performed after each step in order to guarantee accuracy. 
■ Post-leap checks have been avoided in the past because of the worry that rejecting 

leaps will skew the statistics of the sample paths. This problem is bypassed in our 
method by storing all the information gained during each leap for future use. By 
performing a post-leap check to ensure accuracy, the method developed in this paper 
naturally avoids negative population values without the need for any extra effort in 
either a programming or numerical sense. 

Consider a chemically reacting system consisting of > 1 chemical species, 
{Xi, . . . , Xat}, undergoing M > 1 chemical reactions, each of which is equipped with 
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a propensity function (or intensity function in the mathematics hterature), ak{X{t)), 
which is a function of the state of the system at time t, X{t) G Z>q. Let z/^, z/^ G Z>q 
be the vectors representing the number of molecules of each species consumed and 
created in the kth reaction, respectively. If Rk{t) is the number of times that the kth 
reaction has taken place up to time t, then the state of the system at time t is given 
by 

M 

X{t)=X{0) + Y,R,{t){u',-u,). 

k=l 

The fundamental assumption of stochastic chemical kinetics states that the prob- 
ability that reaction k takes place in the infinitesimal amount of time [t, t + At) is 
given by afc(X(t))At + 0{At'^)M That is, P{Rk{t + At) - Rk{t) = 1 | X{s), s<t) = 
ak{X{t))At + 0{At'^). For each k < M, let Yfc(-) be an independent, unit rate Poisson 
process. That is, for any T > 0, and small AT > 0, P{Yk{T + AT) - Yk{T) = 1) = 
AT + O(AT^), for j ^ k. Because the propensity function of reaction k is a^^iXii)) 
until the next reaction takes place, 

p(yJ akiX{s))ds] -yJ [ akiX{s))ds] = 1 | X{s), s < t 



/ v^o / 

2\ 



= ak{X{t))At + 0{Ar 

and we see that Rk{t) can be written as 

Rk{t)=Yk(^j\k{X{s))dsy (1) 

The state of the system at time t can therefore be represented as the solution to the 
following stochastic equation 



x{t) = x{o) + j2yk( / 

k=i ^-^0 



ak{X{s))ds {u'k - fk)- 



We note that even though the processes are independent, the terms 
Yfe {j^ ak{X{s))dsj are dependent because they depend upon the process X(s), for 
s < t? Equation ([2]) is typically called a random time change representation in the 
Mathematics literature.-'^i^'^ 

Note that there are M + 1 distinct time frames in ([2]). The first time frame is 
the actual, or absolute time, t. However, each Poisson process Yk brings its own 
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"internal" time frame. Equation ([T]) shows that at absolute time t, the amount of 
"internal time" that has passed for the process is Tk{t) = ak{X{s))ds. This 
observation leads to the following definition. 

Definition I.l. For each k < M, Tk{t) = afc(X(s))(is is the internal time of the 
Poisson process at absolute time t. 

We note that the values Tk{t) defined above do not actually have units of time. In 
fact, they are unit-less. However, they serve the purpose of allowing us to know where 
we are on the "time frames" of the Poisson processes at absolute time t. 

Remark. It is important to recognize that for any T2 > Ti > Tk{t), the increment 
yk(T2) — Yfc(Ti) is independent from the state of the system, X(t). This independence 
follows from the usual properties of Poisson processes and will eventually allow us to 
reject leaps without adding any bias to the system. 

The outline of the paper is as follows. In Section [III we will briefly introduce 
several exact simulation methods for chemical systems and a widely used approximate 
method known as tau-leaping. While all of the methods presented in Section [III are 
well known, we will consider each through the perspective of equations ([1]) and ([2]), 
which we believe lends insight. In Section IIIII we present our new adaptive tau- 
leaping procedure. In Section IIVI we compare the efficiency of our new algorithm to 
the current adaptive tau-leaping procedures on a model of a decaying dimer. 

II. BACKGROUND 

A. Exact simulation methods 

In order to generate sample paths for a given system all exact simulation methods 
attempt to answer each of the following two questions at a given moment of time, t: 

la. When does the next reaction take place? 

2a. Which reaction takes place at that future time? 

By answering both questions repeatedly, a sample path is constructed. We see from 
equation ([1]) that the above questions are equivalent to: 
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lb. What will be the absolute time of the next firing of the processes 
Y,{j;^a,{X{s))ds)7 

2b. Which process will fire at that time? 

Note that neither the state of the system, X{t), nor the propensity functions, 
ak{X{t)), change between reactions. Therefore, assuming that no other reaction fires 
first, the next firing time of the process Yk^J^ ak{X{s))ds) will be exponentially dis- 
tributed with parameter ak{X{t)). Of course, the logic used in the previous sentence is 
only valid up until the time of the first firing of the various processes, for at that time 
the state of the system, and hence the propensity functions, will change. Therefore, 
we may only conclude that the next firing time of the Processes YkiJ^ CLk{X(s))ds) will 
occur at the minimum time of the exponentially distributed random variables, and 
the reaction that takes place is simply the one associated with the realized minimum 
value. Repeated application of the above idea is the First Reaction Method.- The 
Next Reaction Method^ and modified Next Reaction Method^ use the same principles 
as the First Reaction Method except that by efficient use of information, both need 
to only generate one exponential random variable per iteration as opposed to the M 
needed in the First Reaction Method. See Ref. i4 for full details about how the Next 
Reaction Method and modified Next Reaction Method achieve this efficiency. The 
Gillespie Algorithm,-*^ or Stochastic Simulation Algorithm (SSA), answers the first 
question by using the fact that the minimum of M exponentially distributed random 
variables with parameters ak is exponentially distributed with parameter Ylk=i ^k- 
To answer the second question the Gillespie Algorithm uses the fact that the proba- 
bility that the jth exponential random variable achieves the minimum is a^/ Xlfcli '^fc- 
Therefore, for every iteration of the Gillespie Algorithm one random number is needed 
to find when the next reaction occurs, and one random number is needed to determine 
which reaction occurs at that later time. 

We note that the algorithms described above are considered exact simulation meth- 
ods because they generate statistically exact sample paths for the system ([2]). Typ- 
ically one wishes to use such methods to generate many sample paths in order to 
approximate the underlying probability distributions of the system of interest .-i^^iiii^ 
However, there are instances when the exactness of the methods make them ineffec- 
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tual and approximate techniques are needed. 



B. An approximate method: tau-leaping 

Because they simulate every reaction that takes place, statistically exact methods 
are slow for systems in which many reactions take place over short amounts of time. As 
the algorithms described in this paper are typically used for Monte Carlo simulations 
in which thousands, tens of thousands, or even hundreds of thousands of sample paths 
are needed to get an accurate picture of the underlying probability distributions, it is 
clear why simulation speed is critical. Therefore, approximate techniques have been 
developed that will generate sample paths significantly faster than the exact methods 
and will do so with an acceptable amount of error. One such method is tau-leaping.— 

Consider equations ([1]) and ([2]). We make the observation that there are two nat- 
ural places where we can approximate the system: the Poisson processes, Yk, and the 
propensity functions, a^. In standard tau-leaping, only the propensity functions are 
approximated. More specifically, if one assumes that ak{X{t)) is relatively constant 
in the time interval [t,t + r) (this assumption is typically called the leap condition), 
then, conditioned on X{s) for s < t, the number of times the kth reaction fires in the 
time interval [t, t + t) can be approximated by 

Number of firings = Rk(t + t) — Rk(t) 

= Yk (^j^ ak{x{s))ds^ - Yk (^j^ akix{s))ds^ (^3^ 

~ Yk (^ak{x(t))T + j ak{x{s))ds^ - Yk (^j ak{x{s))ds^ . 

Using that each Yk is a unit rate Poisson process then gives us that, conditioned on 
X{s) for s <t, 

Yk {ak{x{t))T + j ak{x{s))ds^ - Yk i^j afc(x(s))(is^ = Poisson(afc(x(t))r). (4) 

Therefore, we use Poisson random variables to approximate how many times each 
reaction has fired from time t to t + t, and we update the system via 

M 

x{t + r)=x{t) + ^NkWk-Vk), (5) 

k=l 



where Nk is a Poisson random variable with parameter ak{x{t))T. We note that based 
upon the approximation used in ([3]), tau- leaping is similar to an Euler method. 

The subtlety of tau-leaping is in selecting a r before each step so that the leap 
condition holds over the time interval [t,t -\- t). A typical way to make this explicit 
is to search for a r so that for some small e > 

\akiX{t + r)) - afc(X(t))| < max{eafc(X(t)), c^}, (6) 

where Ck is the rate constant for reaction k, which is the smallest amount that a 
propensity function can change. The question now becomes how to go about selecting 
the largest r for which we will be reasonably sure that the condition ([6]) will be 
satisfied. 

The tau-leaping method proposed by Cao et al.— chooses r before each step to be 
the largest value for which both the estimated mean and the estimated standard de- 
viation of the random variable on the left side of equation ([6]) satisfies that condition. 
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It is shown in Ref. 
the 2N quantities 

M 



that a computationally efficient way to do this is to compute 



/i,(X(t)) = - u,),a,{X{t)), z = l,...,N 

i=i 

M 

and then take r to be the value given by 

. f max{eXi{t)/gul} (max{eX^(t)/^„ 1})^ ] 

where gi for each species is a simple prescribed function of Xi{t) whose form is 
fixed at the beginning of the simulation and is given in Appendix |Al In computing r 
with the above method, the leap condition that is actually being satisfied is 



\Xi{t + r) - X,(t)| < max{eX,(t)/(7„ 1}, 
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for full details. 



which then approximately satisfies the leap condition ([6]). See Ref. 
The tau-leaping algorithm presented below chooses tau by a pre-leap computation 
using equation (I7[). 
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Algorithm 1. (Cao et al.— pre-leap computation tau- leaping) 

1. Initialize. Set the initial number of molecules of each species and set t = 0. 

2. Calculate the propensity function, a^, for each reaction. 

3. Calculate r according to equation ([7]). 

4. For each k < M, let = Poisson(afcr). 

5. Set X = X + J2h=i Nkii^'k - T^k) and t = t + T. 

6. Return to step [2] or quit. 

There are two technical features of standard tau-leaping that remain to be dis- 
cussed. The first feature is that during each iteration the algorithm should compute 
ao = Xlfcli and then do the tau leap only if the r calculated via equation (J?!) is 
larger than some small multiple of l/cto; but do one or more time steps with an exact 
simulation method otherwise. Switching between tau-leaping and an exact method 
is reasonable because the benefits of tau-leaping as compared with exact methods 
evaporate, and become negative, as r ^ 1/ao, which is the expected amount of time 
until the next reaction. 

The second feature is more subtle. For each leap, it is possible that the leap con- 
dition will be violated so badly that some population values will become negative. 
In fact, negative population values have been found to occur in simulations using 
tau-leaping on systems of interest .i^^iS As negative population values are physically 
unreasonable, this constitutes a problem, and a number of solutions have been pro- 
posed. Tian et al.— and Chatterjee et slM- independently developed a method in 
which binomial random variables, as opposed to Poisson random variables, are used 
to perform the leap. Because binomial random variables have bounded support, the 
parameters of the binomial random variable can be chosen in a way that guarantees 
no molecular species will become negative in the course of a leap. Cao et al.— then 
developed a method to handle the potential of negative population values in which 
the reactions are partitioned into two sets before the calculation of r: critical reac- 
tions and non-critical reactions. For some predetermined integer, nc, between 2 and 
20, the set of critical reactions is defined to consist of those reactions with a positive 
propensity function that is within Uc firings of exhausting one of its reactants. Having 
split the reactions in such a way before a leap, the algorithm performs a standard 
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tau-leap for the non-critical reactions concurrent with a standard Gillespie Algorithm 
step for the critical reactions. It is guaranteed that among all the critical reactions 
there will be at most one firing during the leap, thereby significantly reducing, but not 
completely doing away with, the chance of achieving a negative value. If a negative 
population value is still achieved, the tau is shortened and the leap is repeated. See 
Ref. Il5| for the full details of this method. 

While both the binomial tau-leaping method and the "critical reaction method" 
guarantee negative population values will be avoided, neither addresses the underlying 
problem of what is driving population values negative: that the leap condition is badly 
violated at times. Instead of handling this larger problem, both the binomial tau- 
leap method and the partitioning method of Cao et al. only handle it when species 
numbers are low (although this is admittedly the most important time to handle this 
problem). Also, the fact that population values can become negative in the absence 
of specific machinery designed to keep them positive points out that other such large 
violations of the leap condition are most likely occurring elsewhere in the simulation, 
yet are going unnoticed. 



III. A NEW TAU-LEAPING PROCEDURE 



Through a post-leap check the procedure developed in this section will only accept 
leaps that demonstrably satisfy a leap condition. A consequence of such enforcement 
will be that achieving negative population values will be impossible, and so the par- 
titioning machinery of Cao et al. will no longer be necessary. Further, as the method 
proposed will adaptively choose tau based upon the success or failure of the previous 
leap, there will be no need to calculate tau before each leap via equation ([7]). 



A. Conceptual framework 

The method proposed in this section relies heavily on the following two facts: 1) 
the internal time frames of the Poisson processes are distinct from each other and from 
the absolute time frame and 2) for T2 > Ti > Tk(t), the interval Yk{T2) — Yk(Ti) is 
independent from the state of the system X(t). Consider equations ([1]) and ([2]). Sup- 
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pose that at time t we have knowledge of the state of the system, X{t), the propensity 
functions, = ak{X{t)), the various internal times Tk = Tk{t) = J^^ ak{X{s))ds, and 
the number of firings of each Poisson process up to time t, Ck = Yfc(Tjfc(t)). How- 
ever, we suppose we have no information about the processes Yk{T) — Yk(Tk) for 
T > Tj.. At this time we attempt to perform a leap with some pre-determined r. By 
equation (j4]), the number of jumps of Yk over the internal time period [Tk,Tk + aur) 
has a Poisson distribution with parameter a^r. We therefore generate M Poisson 
random variables and denote them by N^. Note that we have now fixed the value 
Yk{Tk + cifcr) = Nk + Ck for the course of the simulation. We next approximate the 
state of the system at time t + r via equation ^ and check the leap condition. If we 
verify that the leap condition has been satisfied we may accept the updated system 
and attempt another leap. 

If the leap condition is not satisfied we do not accept the leap, and we do not 
update the system. Instead, we decrease the tau value by choosing some r* < r 
and attempt another leap over this shorter time period. However, we still know that 
Yk{Tk + Ofcr) = Ck + Nk, for each k, and should condition upon this knowledge when 
calculating Yk{Tk + dk^*)- Note that knowing Yk{Tk + flfcr) = Ck + Nk is not the same 
as claiming to know that reaction k fired Ck + A^^^ times by time t + t. The former 
equation is simply a statement about the values of the Poisson process Vfc(-), while 
the latter is a statement about the actual firings of the system by a certain time. We 
prove the following theorem in the appendix. 

Theorem III.l. Let Y{t) he a Poisson process with intensity \, and let < s < u < 

t. Then, conditioned on Y{s) and Y{t), Y{u) — Y{s) has a binomial {Y{t) — Y{s),r) 
distribution, where r = {u — s)/{t — s). 

By Theorem lIII.il the distribution of Yk(Tk + akT*) — Yk(Tk), conditioned on Yk(Tk + 
akT) = Ck + Nk, has a hmomial{Nk,Pk) distribution, where pk = t*/t. After choosing 
the number of times Yk jumps in the internal time period [Tk,Tk + akT*) according 
to the binomial distribution just calculated we repeat the process of attempting an 
update of the state of the system and of checking the leap condition. If this leap 
is also rejected, we simply store the information of how many times Yk jumped by 
internal time Tk + akT*, shorten tau, and try again. For the next attempted leap we 
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only need to condition on Yk(Tk) and Yk(Tk + a^r*) because of the independence of 
intervals of Poisson processes. Eventually, a leap will be accepted and we may move 
forward in both absolute and internal time. 

We note that when we accept a leap and are ready to attempt another one we 
may have stored the value of Yk{T) for many different internal times, T >Tk. When 
we attempt another leap, the next proposed internal time will either fall between two 
internal times we have stored, or will fall beyond our last stored internal time. In the 
former case. Theorem IIII.ll may be applied because of the independence of intervals 
of Poisson processes. In the latter case, the number of firings will be given as the 
number of firings up to the last stored internal time, plus a Poisson random variable 
accounting for the extra internal time. 

The above description is the backbone of our new method. At each absolute time 
t we will attempt a leap of size r. Supposing that we have stored the information 
T\ . . . , and Yk{T^), ^(T^), with T\ . . . , > T^, Tk + atr will either fall 
between two of the stored internal times or fall beyond T'^. As above, in the case 
when < Tfc + OfcT < T'+^ for some i, we apply Theorem IIII.ll by conditioning 
upon Yfc(T*) and Yfc(T*"'"^) and choosing from the appropriate binomial distribution. 
Finally, we add the random variable chosen to Yk{T'^) —Ck to find how many times Y^ 
jumped between the internal times and + a^r. In the case when + a^r > T'^ 
we generate a Poisson random variable with parameter + a^r — T'^ and add it to 
Yfc(T'^) — Ck to find the number of jumps. Acceptance or rejection of the leap then 
depends upon checking the leap condition. If we do reject the leap, we would then 
store the information just learned about each Yk, shorten r, and try again. By storing 
the information gained about the processes Yk after each attempted leap we see that 
no information about the processes Yk will be lost in the course of a simulation. 
Because all of the randomness in the system resides in the processes Yk, we conclude 
that the rejection of leaps will not skew the statistics of the sample paths. 

We have not yet described how to update r after each failed or accepted leap and 
will do so now. In the following we suppose that we have already fixed the e of the 
leap condition ([8]) as e. 

Tau updating procedure: 
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1. If a leap is rejected because it fails the leap condition for e = e, then decrease 
T by multiplying it by some p < 1. 

2. If a leap is accepted because it satisfied the leap condition for e — e, but would 
have failed the leap condition if e = 3e/4, then decrease r by multiplying it by 
some p* that satisfies p < p* < 1. 

3. If a leap is accepted because it satisfied the leap condition for e — e, and would 
have satisfied the leap condition if e = 3e/4, then increase r by raising it to the 
power q for some < q < 1. 

Unhke the method of Cao et al., we are not making an effort to select the largest 
possible T for which the leap condition will hold. However, based upon the tau 
updating procedure given above, it should be clear that we are attempting to select 
a tau that is at least near such a maximal value. However, as the value of e itself is 
rather arbitrary, it docs not seem critical to select such a "largest" tau. 

We point out that Step 2 in our tau updating procedure is useful to keep the 
number of rejected leaps down. In essence. Step 2 forces the algorithm to always 
attempt to satisfy the leap condition for a smaller value of epsilon than what was 
originally chosen, but does not reject the leap if such a restrictive condition is not met. 
Because the generation of Poisson and binomial random variables are computationally 
intensive procedures, attempting to limit the number of failed leaps, and hence the 
number of random variables generated, seems reasonable. 

B. The new algorithm 

The analysis of the previous section gives us a new adaptive tau-leaping procedure. 
Before presenting the algorithm, however, some notation is needed. Each Poisson 
process Yfc will have an associated matrix, Sk, that will serve to store the information 
gained from leaps that fail the post leap check. Each Sk has two columns. The first 
column will store internal times (as opposed to absolute times). The second column 
will store the number of firings of Yj; up to the internal time in the first column. 
That is, the elements of row i satisfy Yk{Sk{i, 1)) = Sk{i, 2). The first row of Sk will 
always contain the present internal time and the number of times has fired up to 
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that time. Also, Tk = Tk(t) will always denote the current internal time of Yk and 
YkiTk) will be denoted by Ck- Combining the above gives that at each step we have 
Tk = Sk{l, 1) and Ck = Sk{l, 2). Finally, the values rowk will be used to update the 
rows of the matrices 5*^ after every step. 

Algorithm 2. (Post-leap check tau- leaping) 

1. Initialize. Set the initial number of molecules of each species, x G Z>o, and 
calculate the propensity functions, a^. Set t = and for each k set Tk = Ck = 0, 
and Sk = [0 , 0]. Calculate r via equation ([7]). Set < p < p* < 1 and 
< g < 1. 

2. For each k do the following: 

(a) Let Bk = the number of rows of Sk- 

(b) UakT + Tk > SkiBkA), 

• Set Nk = Poisson(Tfc + a^r - SkiBk, 1)) + Sk{Bk, 2) - Ck- 

• Set rowk = Bk- 

(c) else 

• Find the index, Ik, such that 

Skih - 1, 1) < Tfc + OfcT < 5*^(4, 1). 

• Set r = {Tk + akT - Skih - 1, 1))/ {Sk{h, 1) - - 1, 1)) • 

• Set Nk = binomial(5fc(4, 2) - 5^(4 - 1, 2), r) + Skih - 1, 2) - C^. 

• Set rowk = 4 — 1. 

3. Check whether the leap condition holds with the selected Nk. 

4. If yes, accept leap: 

(a) Update each Sk- 

• Delete all rows less than or equal to rowk and shift all other rows 
down. Add a new first row of [Tk + akT , Ck + Nk]. 

(b) Set t = t + T. 

(c) For each k, set Tk = Tk + akT and Ck = Ck + Nk. 
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(d) Update r according to the tau updating procedure of the previous section. 

(e) Set X = X + J2k=i ^ki^k ~ ^k) and recalculate the propensity functions. 

(f ) Return to step [2l 
5. Else, reject leap: 

(a) Update each 5*^. 

• Add the row [T^ + a^r , Ck + N^] between rows rowk and rowk + 1 (if 
rowk + 1 > Bk, just add a last row to Sk). 

(b) Decrease r by setting r = pr. 

(c) Return to step [2J 

Remark. In the above algorithm no population value can become negative after an 
accepted leap. Unlike the binomial tau-leap method or the splitting of the reactions 
into critical and non-critical subsets, however. Algorithm [2] handles the underlying 
problem that could cause negative population values. That is, the leap condition will 
never be violated. 

We note that the manner in which we choose our r will generally lead to smaller 
r values than the pre-leap computation method for a given value of e and for a given 
state of the system X{t). Therefore, for a given e we expect that our method will 
need more simulation time than Algorithm [1], but will produce more accurate results. 
Thus, in order to find which algorithm is more efficient we will need to compare them 
with different e values. Also, we note that Algorithm [T] chooses its tau values based 
upon the current state of the system whereas Algorithm [2] chooses its tau values based 
upon the success or failure of the previously attempted leap. While it is true that 
each method will (at least statistically) produce the same leap for a given state of 
the system and a given r, we note that over the course of an entire simulation the 
difference in how each algorithm selects their tau values will cause the statistics of 
the sample paths to diverge. Therefore it is entirely plausible that one method will 
achieve higher accuracy than the other through fewer steps. This is demonstrated in 
Section lYl 
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C. Switching to an exact algorithm 



As noted in the paragraph following Algorithm [H it is sometimes necessary to 
switch between a tau-leaping method and an exact method. Doing so for Algorithm 
[2] is non-trivial as there could be stored future information in the matrices Sk- If any 
of the Sk matrices do have stored future information, then the distributions of the 
future states of the system do not solely depend upon the current state of the system. 
A choice must therefore be made as to how to make the switch in Algorithm [2l One 
option is to discard all stored future information (that is, delete the information in the 
matrices Sk) and switch to an exact method. In this case, it is important to recognize 
that by discarding the stored future information, we have, in effect, changed the 
Poisson processes, Yk, and such a change adds a potential bias to the choice of sample 
paths. For example, it is possible that the algorithm needed to switch to an exact 
method because one or more of the processes Y^. had significantly more firings than 
was expected over a short period of internal time. By discarding this stored future 
information, we may be inadvertently biasing the system away from such instances. 
Typically, however, there are not many switches made from tau-leaping to an exact 
method in the course of a simulation, and so the bias that is added may be negligible. 
Further, doing so would be very simple to implement. 

The other option is to keep all of the stored future information and still switch to 
an exact method. In this case, we preserve the fact that we are not approximating the 
processes Y^ in our simulation. The exact method to which we will switch is similar 
to the modified Next Reaction Method. See Ref. 14 for a detailed explanation of the 
modified Next Reaction Method. 

As in the modified Next Reaction Method, we begin by letting Pj. denote the 
internal time of the next firing of the Poisson process Y^. That is, for each k we 
let Pk = min{T > | Yk(T) > Yfc(Tfe)}. There are two cases to consider in the 
calculation of Pk. 

Case 1: If there is no stored information for the A;th reaction we may use the fact 
that the Y^s are independent, unit rate Poisson processes and set Pk = -|- ln(l/rfc), 
where is uniform(0, 1). 

Case 2: If there is stored information for the kth reaction channel we must condition 
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upon that information in order to calculate the distribution of the next firing time. 
In the Appendix we show the following. 

Lemma III. 2. Let Y(t) be a Poisson process with intensity A. Let t > and = 
F(0) < Y{t) = N. Let = mm{s \ Y{s) > 0}. Then, P{P^ > r \ Y{t) = N) = 
(1-r/t)^. 

We let Bk, Cki and be as in Algorithm [21 By Lemma [III. 21 we may calculate 
Pk in the following manner. First, find the index j such that Ck = Sk{j,2) and 
Ck < Sk{j + 1,2). Thus, the next firing of Yk happens in the internal time interval 
[Sk{j,l), Sk{j + 1,1))- If no such j exists, set Pk = Sk{Bk,l) + ln(l/rfc), where 
Tfc is uniform(0, 1). If such a j does exist, let t^ = Sk{j + 1,1) — Sk{j, 1) and let 
Nk = Sk{j + 1, 2) — Sk{j, 2). Nk gives the number of firings of the Poisson process Yk 
in the internal time period [Sk{j, 1), Sk{j + 1, 1)), which is of length t^. By Lemma 
IIII.2I Pk — Sk{j, 1) has distribution function (1 — r/tk)^'' and so we may set Pk = 
^fc(j,l)+tfc(l-r;/^^), where Tk is uniform(0, 1). 

Algorithm 3. An exact stochastic simulation algorithm given stored future infor- 
mation. 

1. Input: the number of molecules of each species, t, and for each k, the following 
from Algorithm [2] 5*^, Tk, and Ck- 

2. For each k, find Pk as described in the previous paragraph. 

3. Set Afc = {Pk - Tk)/ak. 

4. Set A = min{Ajt} and let A^ be the value at which the minimum is realized. 

5. Set t = t + A and update the number of each molecular species according to 
reaction /i. 

6. For each k, set Tk = Tk + cik^- 

7. Set C^ = C^ + 1. 

8. Find as described in the previous paragraph. 

9. Recalculate the propensity functions. 

10. Update each Sk by deleting all rows with internal times less than or equal to 
Tk and adding a new first row of [Tk , Ck]- 

11. Return to step [3] or return to tau-leaping. 
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Remark. After the first time step, the above algorithm uses only one random variable 
per time step. Also, Algorithm [3] becomes the modified Next Reaction Method if all 
of the information in each Sj. is exhausted before the switch back to tau-leaping is 
made. 

We have shown how to switch successfully from tau-leaping with Algorithm [2] to 
the exact method in Algorithm [31 However, we now have to consider how to switch 
back. It is not instantly clear that we can simply discard the information contained 
in the unused Pk values without adding bias to our system. We also note that we can 
not simply incorporate the Pk values into the 5*^ matrices because each 5*^ contains 
information about how many jumps of Yk have taken place up to different internal 
times and not information about the exact iump times. However, the following theo- 
rem allows us to discard the information stored in the Pk values when we switch from 
Algorithm [3] back to Algorithm [21 The proof can be found in the appendix. 

Theorem III. 3. The statistics of the firing times of each reaction channel are unaf- 
fected by discarding the Pk values when we switch from the exact method of Algorithm 
[3 to tau-leaping in Algorithmic 



IV. A NUMERICAL EXAMPLE 



We compare the different tau-leaping methods on a model of an unstable dimer 
that has been used in a number of earlier papers.— i^ii^ The model consists of four 
reactions and three species. The reactions are 

^ X2 ^ 2Xi 

(9) 

o C2 

ZAi — > A.2 A2 — ^ A3, 

with rate constant Ci = 1, C2 = .002, C3 = 0.5, and C4 = 0.04. The propensity 
functions are ai(X) = Xi, a2{X) = (.002/2)Xi(Xi-l), a3{X) = O.5X2, anda4(X) = 
0.04X2. We chose initial conditions of Xi(0) = 10^ X2(0) = 10^, and X3(0) = 0. We 
imposed the leap condition (jSD in Algorithm [H by calculating tau before each leap 
according to equation ((Tj) and imposed the leap condition in Algorithm [21 by checking 
that the condition was satisfied after every leap. For Algorithm [21 we chose p = .75, 
p* = .9, and q = .98 as the values for our tau updating procedure described at the 
end of Section IIII A[ 
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For e = 0.17, e = 0.1, and e = 0.03 as the e values of the leap condition ([8]), 
we simulated the above system 10^ times using Gillespie's original algorithm (SSA), 
Algorithm [H and Algorithm [21 The simulations were performed using Matlab on 
a 2 Ghz processor running on the Debian operating system. To generate Poisson 
and binomial random variables, we used the Matlab Poisson and binomial random 
number generators. Each simulation began at time t = and ended at time t = 1. In 
Figure 1 we show the histograms of Xi{l) and X2{1) for the different values of e. We 
see that for each e Algorithm [2] is significantly more accurate than Algorithm [H As 
was pointed out at the end of Section IIIIBl however, the tau selection strategy for 
Algorithm [2] will naturally choose smaller values of r for a given value of e. Therefore, 
it is not surprising that Algorithm [2] is significantly more accurate for each e. The 
CPU times for the different methods and different e values are given in the following 
table. 



CPU Times 


e = 0.17 


e = 0.10 


e = 0.03 


Alg. □ 
Alg. [2] 


31 CPU Minutes 
60 CPU Minutes 


46 CPU Minutes 
82 CPU Minutes 


120 CPU Minutes 
226 CPU Minutes 



As was predicted in Section IIII Bl the increased accuracy of Algorithm [2] for each e 
came at the price of longer CPU times. 

In Figure 2 we plot the histograms of Xi(l) and ^2(1) of the Gillespie Algorithm 
(SSA), Algorithm [U with e = 0.03, and Algorithm [2] with e = 0.1. We see that the 
histograms of the different tau-leaping methods with their respective e values are now 
nearly equivalent in their accuracy. However, based upon the CPU times in the above 
table, in order to get such accuracy Algorithm [1] required 46% more CPU time than 
Algorithm [21 We therefore see that, at least for this model, our new post-leap check 
method is more efficient than the pre-leap computation method. 

We next simulated the system 100 times using Algorithm [H with e = 0.03 and 
Algorithm [2] with e = 0.1, except this time we added machinery to keep track of 
how many leaps were needed in each algorithm to complete the simulation. We 
found that Algorithm [21 took an average of 203.4 successful leaps per simulation and 
rejected an average of 37.8 leaps. We found that Algorithm [H took an average of 466.8 
leaps. Therefore, Algorithm [21 needed 56% fewer successful leaps than Algorithm [H 
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which imphes that the average size of tau for the successful leaps of Algorithm [2] 
was approximately double that of Algorithm [H The reason Algorithm [2] can achieve 
similar accuracy to Algorithm [T] with larger average tau values was explained at the 
end of Section IIIIBI 

V. DISCUSSION 

By explicitly representing the reaction times of discrete stochastic chemical systems 
as the firing times of independent, unit rate Poisson processes, we have developed an 
accurate and efficient adaptive tau-leaping procedure. The main difference between 
the method developed in this paper and the current adaptive tau-leaping methods is 
that we enforce our leap conditions via a post-leap check. Also, we have demonstrated 
how to reject leaps without affecting the statistics of the sample paths generated. 
Further, as a consequence of always satisfying a given leap condition, our procedure 
is guaranteed to never produce negative population values, which is in contrast to 
current methods in which extra machinery is needed to guarantee that population 
values remain non-negative. Finally, through an example of an unstable dimer, we 
have demonstrated that the method proposed in this paper is not only extremely 
accurate, but is also extremely efficient. 

Algorithm [2] will surely not be more efficient than Algorithm [1] for all chemical 
reaction systems. However, by enforcing a post-leap check in such a way that the 
statistics of the sample paths are not skewed by the rejection of a leap, we are confident 
that Algorithm [2] will be more accurate and will have better stability properties for 
all chemical reaction systems. We also note that the method we have developed is 
easily adaptable to any leap conditions, not just those given by and ([HD. Such 
adaptability is in contrast to methods that compute r before each leap, which need 
a new tau computation procedure for every new leap condition. 

This paper represents only a first step in developing and analyzing tau-leaping 
methods through an understanding of the random time change representation given 
in equation (l2l). Future work will focus on error analysis and the development of 
methods that will achieve greater accuracy through fewer steps. 
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APPENDIX A: DEFINITION OF FUNCTIONS 

The functions gi = gi{Xit)) used in equation ([7]) are defined as follows. Let HOR(i) 
denote the highest order of reaction in which species Xj appears as a reactant. 

(i) If HOR(i) = 1, take gi = 1. 

(ii) If HOR(z) = 2, take g^ = 2, except if any second-order reaction requires two Xi 
molecules in which case take gi = 2 + l/{Xi{t) — 1). 

(iii) If H0R(2) = 3, take g^ = 3, except if some third-order reaction requires two Xi 
molecules in which case take 

3 / 1 
9^ = 7;[^ + 



2\ 

except if some third-order reaction requires three Xi molecules in which case 
take 

1 2 



X,(t)-1 X,(t)-2- 



APPENDIX B: PROOFS 



Proof, (of Theorem IIII.1|) Without loss of generality, we suppose that s = and 



r(0) = 0. Let Y{t) = X and < M < t. Then 

P{Y{u) = J I F(t) = X) = P{Y{u) = J, Y{t) = N)/P{Y{t) = X) 

= P{Y{t) - Y{u) = X - 3)PiY{u) = j)/P{Y{t) = X) 

_ e-^(*-")(A(t - u))^'^ e-^'^iXuy X! 
~ (X-j)! j! e-^*(At)^ 



□ 



19 



Proof, (of Lemma IIII.2P 

P(pi > r I Y{t) = N) = P{Y{r) = | Y{t) = N) 

= P{Y{r) = 0,Y{t) = N)/P{Y{t) = N) 

= PiY{t) - Y{r) = N)P{Y{r) = 0)/P{Y{t) = N) 
_ e-^(^-'-)(A(t-r))^ m 

= (1 - r/tf. 

□ 

Proof, (of Theorem IIII.3I) If there are no stored reaction times for reaction channel 
k, then discarding the extra information contained in is done by invoking the 
loss of memory property. Now suppose that there is stored information for reaction 
k. The firing times less than T > of a Poisson process Y, conditioned on Y{T), 
are uniform(0,T) random variables. Therefore, it is sufficient to show that for a 
uniform(a, b) random number U, P{U > s + x \ U > x) = {b — {s + x))/{b — x), thus 
showing that the conditional statistics are uniform(a;, 6). The calculation is simple 
and is omitted. □ 
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Captions 



Caption for Figure 1. 

Histogram plots of Xi{l) and X2{1) for e = 0.03,0.10,0.17. Each plot was gener- 
ated by simulating the system 10^ times using the SSA (dashed curve with 'x'). 
Algorithm 1 (solid curve with 'o'), and Algorithmic] (solid curve with 'A'). Algorithm 
[2] is consistently more accurate than Algorithm (H as was expected. 



Caption for Figure 2. 

Histogram plots of Xi{l) and ^2(1) found from 10^ simulations of system ([9]). 
We show the SSA (dashed curve with 'x'). Algorithm [1] with e = 0.03 (solid curve 
with 'o'), and Algorithm [2] with e = 0.1 (solid curve with 'A'). The distributions of 
the sample paths generated by Algorithms [T] and [2] with these e values have similar 
accuracies and can therefore be used as a fair test for efficiency. 



Caption for Table on page 17. 

CPU times needed for Algorithms 1 and 2 to complete 10^ simulations of system 
([9]) for different e values. 
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